METHOD AND DEVICE FOR PROVIDING STREAMING MEDIA CONTENT, MEDIA CONTENT RENDERING METHOD, AND USER TERMINAL
Patent abstract:
METHOD AND DEVICE FOR PROVIDING STREAMING MEDIA CONTENT, MEDIA CONTENT RENDERING METHOD, AND USER TERMINAL

Metadata defining decoding and rendering instructions for media content to be co-rendered in a media presentation is divided and distributed as track fragments (15, 16) provided in different media container files (11). Track fragment adjustment information (20) is included in at least one track fragment (15) in order to define rendering synchronization relationships between the portions of media content defined by the track fragments (15, 16) in a current media container file (11). The rendering synchronization relationships allow correct time alignment of the playback of the media content to be co-rendered, so that a synchronized media presentation is obtained. Track fragment adjustment information (20) is particularly advantageous in connection with tuning in or random access in a continuous stream of media container files (1, 11) comprising fragmented metadata.

Publication number: BR112012010772A2
Application number: R112012010772-0
Filing date: 2010-11-05
Publication date: 2020-09-08
Inventors: Per Fröjdh; Clinton Priddle; Zhuangfei Wu
Applicant: Telefonaktiebolaget LM Ericsson (publ)
Patent description:
"METHOD AND DEVICE FOR PROVIDING STREAMING MEDIA CONTENT, MEDIA CONTENT RENDERING METHOD, AND USER TERMINAL"

TECHNICAL FIELD

The present invention generally relates to media content, and in particular to enabling synchronized media rendering on a user terminal.

BACKGROUND

The Moving Picture Experts Group (MPEG) has standardized the ISO base media file format, which specifies a general file format serving as a basis for several more specific file formats, such as the 3GP file format. The file structure is object oriented, and a file is made up of a series of objects called boxes. The structure of a box is deduced from its type. Some boxes contain only other boxes, while most boxes contain data. All data in a file is contained in boxes.

A file can be divided into an initial track, contained in a "moov" movie box, and a number of incremental track fragments, contained in "moof" movie fragment boxes. Each track fragment extends the media presentation in time. The movie box and the movie fragment boxes are metadata boxes containing the information needed by a user or client terminal to decode and render the media presentation. The actual media data is stored in media data boxes of type "mdat".

Track fragments make it possible to distribute metadata in multiple blocks and thereby avoid the situation where the complete file structure needs to be known to the client at the start of playback or rendering. As the metadata is distributed as a sequence of track fragments, these track fragments can be created live during transmission and/or chosen from different versions with different bit rates.

Seeking in ISO/3GP files depends on this structure. When track fragments are used, seeking to a position is difficult because the client can only move one track fragment at a time. The reason for this is that the client needs to know the length of a track fragment to find the beginning of the next track fragment.
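As a purely illustrative aside (not part of the patent text), the box layout described above can be sketched in Python. The parser below assumes the standard ISO base media file format header: a 4-byte big-endian size, a 4-byte type, then the payload; the `iter_boxes` helper and the sample byte string are invented for this example.

```python
# Illustrative sketch of walking the top-level boxes of an ISO base media
# file. Box layout (ISO/IEC 14496-12): 4-byte big-endian size, 4-byte type,
# then the payload; size == 1 means a 64-bit "largesize" follows the type,
# and size == 0 means the box extends to the end of the file.
import struct

def iter_boxes(data: bytes):
    """Yield (box_type, payload) for each top-level box in `data`."""
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # 64-bit largesize follows the type field
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:  # box extends to end of file
            size = len(data) - pos
        yield box_type.decode("ascii"), data[pos + header : pos + size]
        pos += size

# A minimal fragmented stream: an (empty) moov followed by a moof, then
# an mdat box carrying four bytes of media data.
sample = (
    struct.pack(">I4s", 8, b"moov")
    + struct.pack(">I4s", 8, b"moof")
    + struct.pack(">I4s", 12, b"mdat") + b"abcd"
)
print([t for t, _ in iter_boxes(sample)])  # ['moov', 'moof', 'mdat']
```

This mirrors the structure described above: a client encountering such a stream sees the moov box first, then one moof box per fragment, each followed by its media data.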
In addition, when a media track is divided into fragments, the track fragments are unlikely to be perfectly aligned. This is at least partially due to different sampling frequencies. For example, video can be sampled at a rate of 30 frames per second, i.e. 33 ms between frames, while audio can be grouped in units of 20 ms in length. Even in this simple example, it is obvious that the track fragments will rarely be aligned.

This becomes problematic at random access, when the playback of a clip does not start at the beginning, for example after a seek. A track fragment can be requested, but it will be played out of sync because the client has no way of knowing the differences in track fragment lengths, lacking knowledge of the preceding track fragment lengths.

A trivial solution to this problem would be to give track fragments an explicit time stamp. The synchronization relationship between track fragments would then be well known. But this would remove an important property of the track fragments: the fact that they do not carry explicit time stamps makes insertion of advertising, splicing of programs, etc. very simple. In fact, the addition of explicit time stamps to the track fragments could put restrictions on how the track fragments can be used that are not present today.

There is thus a need in the art for an efficient solution that allows seeking and random access in a continuous stream of media content that has been fragmented and defined by a sequence of track fragments, while still achieving synchronized rendering. In particular, there is a need for such a solution that does not require the use of explicit time stamps for the track fragments.

SUMMARY

It is a general objective to enable a synchronized media presentation involving the co-rendering of media content. This and other objectives are met by embodiments as described herein. One aspect of the embodiments relates to a method of providing streaming media content.
The method involves providing an initial track for each media content of at least a first media content and a second media content. The initial tracks are provided in a media container file and define the respective media contents. At least one track fragment is provided for each media content. The track fragments comprise information defining decoding and rendering instructions applicable to the respective media content portions of the relevant media content. At least one track fragment is further complemented with track fragment adjustment information that is applicable to the portion of media content defined by the track fragment. The track fragment adjustment information defines rendering synchronization relationships between the media content portion of the relevant media content and a corresponding media content portion of at least one other media content to be co-rendered during a media presentation.

The track fragment adjustment information thus defines the rendering synchronization relationships between the portions of media content defined by the track fragments. The rendering synchronization relationships allow correct alignment, in terms of rendering time, of the different portions of media content to be co-rendered, so that a synchronized media presentation is obtained.

Another aspect of the embodiments defines a device for providing streaming media content. The device comprises a track provider configured to provide an initial track for each media content of at least a first and a second media content in a media container file. A fragment provider provides at least one track fragment for each media content, where the track fragment defines decoding and rendering instructions applicable to a portion of the relevant media content. An information provider is implemented in the device to provide track fragment adjustment information in at least one track fragment associated with the first media content.
The track fragment adjustment information is then applicable to the portion of the first media content defined by the track fragment.

Another aspect of the embodiments involves a method of rendering media content. The method comprises requesting and receiving a media container file comprising an initial track for each media content of at least a first and a second media content. The initial track of the first/second media content defines the first/second media content. A subsequent media container file is also received. The subsequent media container file comprises a track fragment for each media content. The track fragment comprises information defining decoding and rendering instructions for a portion of the media content. The subsequent media container file also comprises track fragment adjustment information, contained in at least one track fragment associated with the first media content and applicable to the portion of the first media content as defined by the track fragment. The portion of the first media content defined by its track fragment in the subsequent media container file and the corresponding portion of the second media content defined by its track fragment in the subsequent media container file are co-rendered based on the track fragments and on the track fragment adjustment information. The track fragment adjustment information then defines rendering synchronization relationships between the portion of the first media content and the corresponding portion of the second media content, so as to obtain correctly timed rendering of the media content portions and a synchronized media presentation.

Yet another aspect of the embodiments defines a user terminal comprising a transmitter configured to transmit a request for a media container file comprising a respective initial track for each media content of at least a first and a second media content. A receiver is
provided in the user terminal, receiving the media container file and a subsequent media container file comprising track fragments for the first and second media contents and track fragment adjustment information. A media player is configured to co-render the portions of the first and second media content defined by the track fragments in the subsequent media container file. The media player conducts the co-rendering based on the track fragments defining the decoding and rendering instructions and based on the track fragment adjustment information defining the rendering synchronization relationship between the portions of the first and second media content.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention, together with further objects and advantages thereof, may best be understood by reference to the following description taken together with the accompanying drawings, in which:

Figure 1 is a flow chart illustrating a method of providing media content according to an embodiment;
Figure 2 is a flow chart illustrating additional steps of the method in Figure 1;
Figure 3 schematically illustrates advantages of the embodiments over the prior art;
Figure 4 is a flow chart illustrating an embodiment of the information providing step in Figure 1;
Figure 5 is a schematic illustration of the organization of the track fragment adjustment information according to an embodiment;
Figure 6 is a flow chart illustrating additional steps of the method in Figure 1;
Figure 7 schematically illustrates a movie fragment with a coder/decoder (codec) start-up phase and a regular playback phase;
Figure 8 illustrates an example of fragments with start-up phases, where each start-up overlaps with the preceding fragment;
Figure 9 is a schematic block diagram of a device for providing media content according to an embodiment;
Figure 10 is a schematic block diagram of an embodiment of the information provider in Figure 9;
Figure 11 is a schematic block diagram of
a device for providing media content according to another embodiment;
Figure 12 is a flow chart illustrating a method of rendering media content according to an embodiment;
Figure 13 is a flow chart illustrating an embodiment of the co-rendering step in Figure 12;
Figure 14 is a flow chart illustrating additional steps of the method in Figure 12; and
Figure 15 is a schematic block diagram of a user terminal according to an embodiment.

DETAILED DESCRIPTION

Throughout the drawings, the same reference numbers are used for similar or corresponding elements. The embodiments generally relate to processing media content, and in particular to techniques allowing synchronized rendering of media content in connection with fragmented media using track fragments to define the media content. The embodiments complement the metadata contained in the track fragments with track fragment adjustment information that is applicable to portions of media content and defines rendering synchronization relationships between the media contents that are to be co-rendered during a media presentation. The track fragment adjustment information can thus be used by a user terminal as a synchronization basis, allowing synchronized rendering even during a random access or seek procedure. Thus, the track fragment adjustment information allows the user terminal to access a stream of track fragments at any point and still be able, based on the track fragment adjustment information, to derive the rendering synchronization relationships even if the previous track fragments in the stream have not been received or processed by the user terminal.

The media content as described herein relates to media data that can be communicated to a user terminal for decoding and rendering thereon in order to provide a media presentation to a user. The media content can thus be video content or data that is played and displayed on a monitor screen.
Alternatively, or in addition, the media content can be audio content or data that is played and can be listened to by a user through a loudspeaker. In a particular embodiment, at least a first media content and a second media content are co-rendered during a media presentation. One of the first media content and the second media content is then advantageously video content, while the other of the first media content and the second media content is typically audio content. Alternatively or in addition, if a first media content represents the video data of a left view for 3D or stereo video, a second media content could then be the video data for the right view. Furthermore, a third media content could be the audio data that is to be co-rendered with the 3D or stereo video data.

According to the embodiments, the media content is advantageously streamed to the user terminal. The embodiments are particularly suitable for use in connection with Hypertext Transfer Protocol (HTTP) streaming of media content. The media content can, for example, be live media content generated or captured in connection with the streaming process. However, previously generated, i.e. offline, media content can also advantageously be streamed as described herein. A further variant is so-called adaptive content, which can be live or offline. Streaming of media content also encompasses variants thereof that provide streaming-like delivery, including, for example, progressive download of 3GP files.

Figure 1 is a flow chart illustrating a method of providing media content for streaming. The method starts in step S1, which provides a respective initial track for each media content of at least a first media content and a second media content in a media container file.
The media container file can be regarded as a complete input package that preferably comprises, in addition to any media content per se, the information and instructions required by the user terminals to perform the decoding and rendering of the media content. The ISO base media file format can advantageously be used as the file format for the media container file, including various storage formats derived from or based on the ISO base media file format, such as the Advanced Video Coding (AVC) file format. The AVC file format in turn specifies how H.264 (MPEG-4 AVC) content is carried in various file formats derived from the ISO base media file format, for example the MP4 and 3GP file formats.

The initial track provided in step S1 comprises metadata defining the particular media content with which the initial track is associated. This metadata includes the information needed by a user terminal to decode and render the media content associated with the initial track. This means that the metadata includes, but is not limited to, identifiers of the relevant media content and information on the coder/decoder (codec) employed to encode the media content.

According to the embodiments, the metadata related to a particular media content is not provided as a single track or file. In clear contrast, the initial track for a media content is complemented by at least one, typically multiple, i.e. at least two, track fragments that complement the metadata provided in the initial track. The next step S2 thus provides at least one such track fragment for each media content. A track fragment then comprises information and metadata defining decoding and rendering instructions for a media content portion of the relevant media content. Thus, the media content is built up from a number of media samples, such as video or audio samples. In such a case, the track fragments may be applicable to different media samples of the media content.
For example, the initial track could include general metadata but not provide any specific decoding or rendering instructions applicable to particular media samples. A first track fragment could then include the metadata applicable to media samples 1 to k, a second track fragment includes metadata for media samples k+1 to m, a third track fragment includes metadata for media samples m+1 to n, and so on (k < m < n).

There is generally an advantage in fragmenting or splitting metadata into multiple track fragments compared to using a single track with all metadata. Firstly, streaming of live content can only be effectively accomplished if the metadata is provided in different track fragments that are generated and transmitted while the media content is generated. Otherwise, all media content would first have to be generated before the metadata could be generated and transmitted to the user terminal. Fragmentation of the metadata thus allows transmission of media content samples and of the metadata applicable to these media samples as they are generated, without having to wait for the completion of the generation of all media content. Secondly, transmitting all metadata relating to the media content of a media presentation as a single track or file can amount to the transmission of a large quantity of data. There would thus be a noticeable delay from requesting a particular media content until the user terminal has received and processed all the metadata and is able to start decoding and rendering the media content.

Thus, the metadata applicable to the media content is divided into different track fragments according to the embodiments. The track fragments provided in step S2, or at least a portion of them, can be included in the same media container file as the initial tracks provided in step S1.
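As an illustrative aside (the helper function and the sample boundaries below are invented for this sketch, not part of the patent), the mapping from media sample number to track fragment described above, i.e. samples 1 to k in fragment 1, k+1 to m in fragment 2, and so on, can be expressed as a simple lookup:

```python
# Sketch: locate which track fragment carries the metadata for a given
# media sample, given the last sample number covered by each fragment.
import bisect

def fragment_for_sample(boundaries, sample):
    """boundaries[i] is the last sample number covered by fragment i+1."""
    return bisect.bisect_left(boundaries, sample) + 1

# Invented example boundaries: k = 30, m = 75, n = 120.
boundaries = [30, 75, 120]
print(fragment_for_sample(boundaries, 1))    # 1 (samples 1..k)
print(fragment_for_sample(boundaries, 31))   # 2 (samples k+1..m)
print(fragment_for_sample(boundaries, 120))  # 3 (samples m+1..n)
```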
However, it is generally preferred to group and organize the track fragments into one or more subsequent media container files, to thereby allow transmission of the media container file with the initial tracks independently of the transmission of the track fragments.

A next step S3 of the method in Figure 1 provides the track fragment adjustment information in at least one track fragment associated with at least one of the first and second media contents. The track fragment adjustment information is applicable to a media content portion of the relevant media content and defines rendering synchronization relationships between the media content portion and a corresponding media content portion of the other media content to be co-rendered with the media content portion during a media presentation. For example, the track fragment adjustment information can be provided in step S3 in a track fragment associated with the first media content. The track fragment adjustment information then defines rendering synchronization relationships between a media content portion of the first media content and a corresponding media content portion of the second media content to be co-rendered with the media content portion of the first media content.

A particular embodiment of step S3 provides track fragment adjustment information not only in a first track fragment associated with the first media content, but also in a second track fragment associated with the second media content. Then, the track fragment adjustment information in the first track fragment is applicable to a media content portion of the first media content defined by the first track fragment, whereas the track fragment adjustment information in the second track fragment is applicable to the corresponding media content portion of the second media content defined by the second track fragment.
The track fragment adjustment information comprises synchronization information that can be processed by a user terminal to determine a synchronization relationship between the media content portions of the first and second media contents to be co-rendered during the media presentation. This means that the user terminal can use the synchronization information to correctly synchronize the start of the rendering of the respective media content portions to obtain a synchronized media rendering.

Figure 3 schematically illustrates the concept of providing tracks, track fragments and the track fragment adjustment information. In the upper left portion of Figure 3, there are three parallel tracks of metadata applicable to different media contents that can be co-rendered during a media presentation. For example, the first track could relate to video content encoded at a particular bit rate, the second track to audio content, and the third track also to video content, but encoded at another bit rate. Alternatively, the first track could be video content of a left view, the second track video content of a right view for 3D or stereo video, and the third track audio content.

According to the embodiments, the metadata of the different tracks is divided into respective initial tracks 5, 6, 7 provided in a first media container file 1 and following track fragments 15, 16, 17. The initial tracks 5, 6, 7 are preferably arranged in a movie container ("moov") or box 2, which is a container for the metadata relating to the media content. The movie container 2 typically also comprises a movie header container ("mvhd") or box 3 defining global information that is media independent and relevant to the complete media presentation considered as a whole. The figure also illustrates the media data container ("mdat") or box 4, which is the container for the actual media data of the media presentation.
Generally, the moov box 2 sets up the basic structure of a media presentation, preferably storing the most important information, such as track numbers, sample descriptions and the track identifiers that are needed to make the media content data understandable and manipulable. In addition, it may contain only few or no samples in its tracks 5, 6, 7. The tracks are formed in this way to be able to start quickly and tune in at a certain point in time. For this concise or empty movie box 2, samples are added consecutively in the following movie fragments, which are connected to the movie box 2 by the movie extension box ("mvex") (not shown). In this way, the user terminal can perform a quick start-up just by downloading the small-size moov box 2 and requesting the other metadata and media content data referenced by the track fragments in due course.

The upper right part of Figure 3 illustrates a subsequent media container file comprising the track fragments, but without any fragment adjustment, i.e. according to prior art techniques. In such a case, all track fragments will be interpreted by a user terminal as having the same start time, even if the initial tracks 5, 6, 7 preceding the track fragments are not of the same length in time. In the illustrated example, the initial track 6 for the second media content has a much shorter duration than the initial tracks 5, 7 for the first and third media contents. This means that the track fragment of the second media content should actually be started before the track fragments of the first and third media contents. However, in order for the user terminal to be aware of this difference in the start times of the track fragments, the user terminal must access the stream at the beginning of the stream and then process the initial tracks 5, 6, 7 and each track fragment preceding the current set of track fragments.
If the user terminal instead conducts a random access or a seek in the stream, it will not have this prior information on the respective durations of the preceding track fragments. The user terminal will thus assume that the track fragments in the current media container file have the same start time, as indicated at the top right of Figure 3. The result will be a non-synchronized rendering of the media, where, for example, the audio playback will be out of phase with the rendering of the corresponding video.

The lower right part of Figure 3 illustrates an embodiment in which track fragment adjustment information 20 is provided in at least some of the track fragments 15, 17. The track fragment adjustment information 20 provides the instructions and information enabling the user terminal to determine that different track fragments 15, 16, 17 are to have different start times. This means that the track fragment 16 for the second media content is to start ahead of the track fragments 15, 17 of the first and third media contents in the illustrated example.

The track fragments 15, 16, 17 and the track fragment adjustment information are preferably included in a subsequent media container file 11. In similarity to the first media container file 1, the subsequent media container file preferably comprises a movie fragment container ("moof") or box 12, which is a container for the metadata relating to the subsequent media content portions. The movie fragment container 12 typically also comprises a movie fragment header container ("mfhd") or box 13. The movie fragment header box 13 typically contains a sequence number, as a safety check. The sequence number usually starts with 1 and increases for each track fragment in the files 1, 11, in the order in which they occur. This allows readers to check the integrity of the sequence of the media container files 1, 11.
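Purely as an illustration of the integrity check just described (the helper is invented for this sketch, not part of the patent), a reader could verify that the mfhd sequence numbers increase by one per movie fragment, and thereby detect a missing media container file in the stream:

```python
# Sketch: check that mfhd sequence numbers form an unbroken run, as the
# text describes (starting at the first received number, +1 per fragment).
def first_gap(seq_numbers):
    """Return the 1-based position of the first out-of-sequence number,
    or None if the sequence is intact."""
    expected = seq_numbers[0]
    for position, seq in enumerate(seq_numbers, start=1):
        if seq != expected:
            return position
        expected += 1
    return None

print(first_gap([1, 2, 3, 4]))  # None: sequence intact
print(first_gap([1, 2, 4]))     # 3: the fragment numbered 3 is missing
```

Note that a client tuning in mid-stream would start the check at whatever sequence number it first receives, which is why the check detects gaps rather than requiring the sequence to start at 1.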
The figure also illustrates the media data container 14, which can be included in the subsequent media container file 11.

The track fragment adjustment information 20 of the embodiments enables synchronized rendering of the media content by a user terminal, providing the information and instructions necessary to identify the synchronization relationships between the track fragments 15, 16, 17. Metadata in the form of track fragment adjustment information 20 is thus added to the track fragments 15, 16, 17, describing synchronization operations. This can be done, as discussed herein, in the form of a new box that gives the relationship between the media time of the samples in a track fragment 15, 16, 17 and a hypothetical timeline. This metadata may be redundant if a clip is decoded from the beginning, or if a preceding moof box 12 in a sequence is decoded. The creator of the media container file 11 provides this information as a substitute for the synchronization information that would be gained by decoding the file stream from the beginning.
A user terminal that accesses the stream from the beginning, starting with the media container file with the initial tracks, and continuing to receive all subsequent media container files, does not require track fragment adjustment information. In clear contrast, the user terminal can determine the synchronization relationships between the track fragments based on the respective durations of the previously received track fragments. In such a case, the user terminal may simply not consider the track fragment adjustment information. However, a user terminal that accesses the flow at some point after starting the flow, such as in connection with a search operation, will not know, according to the prior art, relative departure times for the trail fragments. The reason for this is that the user terminal has no prior knowledge of the respective durations of trail fragments passed in the stream. In such a case, the user terminal may use the track fragment adjustment information provided in the subsequent media container file in order to achieve synchronization between rendering of the media contents ii defined by the respective track fragments. The track fragment setting information in any subsequent media container files can be neglected by the user terminal as long as it then knows the time relationships between track fragments based on the respective durations of the track fragments in the track container file. preceding media. In an alternative embodiment, the trail fragment adjustment information is provided in response to a request originating from a user terminal. Figure 2 illustrates this concept schematically. The method starts at step S10 in which a random access request is received. The random access request states “what does a user terminal intend to access the media stream at an intermediate point, that is, not receiving and rendering since the start of the media stream. 
The method then continues to step S1 of Figure 1, in which the media container file with the initial tracks is provided. This media container file is transmitted in step S11 to the requesting user terminal. Although the media content can be streamed continuously to the user terminal, this initial media container file can be sent out of stream, such as being downloaded to the particular user terminal. The media container file then comprises the initial metadata required by the user terminal in order to set up the decoding configuration and be ready to receive, decode and render the media content.

The method continues to steps S2 and S3 of Figure 1, where the track fragments and the track fragment adjustment information are provided. The following step S12 inserts the track fragments and the track fragment adjustment information in a subsequent media container file that is streamed to the user terminal. As new media content is to be defined, new track fragments and new track fragment adjustment information are preferably provided and included in subsequent media container files that are streamed to the user terminal, which is schematically illustrated by the line L1. In a particular embodiment, each subsequent media container file comprises one track fragment per media content and per initial track in the first media container file.

It is also possible to omit the addition of any track fragment adjustment information in the subsequent media container files except the first subsequent media container file streamed to a user terminal. The user terminal generally has no need for the track fragment adjustment information in the other subsequent media container files, as long as it can correctly align the track fragments in these files based on the lengths of the track fragments in the previously received subsequent media container file(s).

The streaming of the subsequent media container files to the user terminal is, in a particular embodiment, conducted using HTTP.
HTTP is expected to be used as the main protocol for the distribution of multimedia data. Currently, the packet-switched streaming service already supports it to some extent using the 3GP file format. An advantage of HTTP as a streaming protocol over the traditional streaming protocol, the Real-time Transport Protocol (RTP), is that most proxy servers and firewalls support HTTP, while not all proxies and firewalls can correctly handle RTP. The separate transmission of the first media container file in step S1 and the continuous stream of subsequent media container files with track fragments and track fragment adjustment information in step S12 can also be implemented in embodiments where no random access requests or other triggers for providing the track fragment adjustment information are received. The stream of media container files can then be in the form of an initial media.3gp file comprising the initial tracks and the moov box. This initial media container file is then followed by the subsequent media container files comprising the track fragments and the moof boxes: media_001.3gs, media_002.3gs, media_003.3gs and so on. Figure 4 is a flowchart illustrating an embodiment of providing the track fragment adjustment information. The method continues from step S2 of Figure 1. A next step S20 defines a number of adjustment segments in the media content portion to which the track fragment adjustment information applies. The presentation of the media content during which the media content portion is to be rendered is thus divided into one or more adjustment segments, so that time adjustments can be defined. The following steps S21 to S23 of Figure 4 are then preferably conducted for each adjustment segment defined in step S20 for a particular track fragment, which is illustrated schematically by the line L2. Step S21 determines a time duration for the adjustment segment.
The time duration preferably defines the duration or extent of the particular adjustment segment in units of a timescale associated with the media presentation, as provided in the movie header box (see Figure 3) or elsewhere in the media container file. A next step S22 determines a time indication for the adjustment segment. The time indication represents a start time of the adjustment segment within the media content portion of the relevant media content. Alternatively, the time indication represents an empty time, indicating that no rendering of the media content should occur for the duration of the adjustment segment. An optional next step S23 determines a rate indication for the adjustment segment. The rate indication represents a relative rendering rate for the media content corresponding to the adjustment segment. The loop of steps S21-S23 can be run in series for different adjustment segments or in parallel. The next step S24 defines the track fragment adjustment information so that the particular track fragment comprises the number of adjustment segments defined in step S20, the time duration for each adjustment segment from step S21, the time indication for each adjustment segment from step S22 and optionally the rate indication for each adjustment segment from step S23. Referring to Figure 5, in a particular embodiment a track fragment is defined as a track fragment box or container ("traf") 15. The track fragment 15 preferably comprises a track fragment header box or container ("tfhd") (not shown) and optionally one or more track fragment run boxes or containers ("trun") (not shown). The track fragment box 15 is a container for a track fragment adjustment box ("tfad") 25.
Track fragment adjustment box 25 can be defined, in line with the ISO-based media file format, as:

Box type: "tfad"
Container: Track fragment box ("traf")
Mandatory: No
Quantity: Zero or one

The track fragment adjustment box 25, if present, is advantageously positioned after the track fragment header box and before the first track fragment run box. The track fragment adjustment box 25 is then a container for the track fragment adjustment information 20.

   aligned(8) class TrackFragmentAdjustmentBox extends Box('tfad') {
   }

In a particular embodiment the track fragment adjustment box 25 is a container for the track fragment media adjustment box or container ("tfma") 20 that provides the track fragment adjustment information, such as in the form of explicit timeline offsets:

Box type: "tfma"
Container: Track fragment adjustment box ("tfad")
Mandatory: No
Quantity: Zero or one

   aligned(8) class TrackFragmentMediaAdjustmentBox
      extends FullBox('tfma', version, 0) {
      unsigned int(32) entry_count;
      for (i = 1; i <= entry_count; i++) {
         if (version == 1) {
            unsigned int(64) segment_duration;
            int(64) media_time;
         } else { // version == 0
            unsigned int(32) segment_duration;
            int(32) media_time;
         }
         int(16) media_rate_integer;
         int(16) media_rate_fraction = 0;
      }
   }

The version parameter is an integer that specifies the version of the track fragment media adjustment box 20, typically 0 or 1. In a particular embodiment, version 1 has 64 bits per element field, while version 0 has 32 bits per element field. The entry_count parameter 21 is an integer that gives the number of entries in the track fragment media adjustment box 20 and corresponds to the number defined in step S20. The segment_duration parameter 22 is an integer that specifies the duration of the adjustment segment, preferably in units of the timescale in the movie header box. The segment duration 22 was determined in step S21 of Figure 4.
The media_time parameter 23 is an integer containing the start time within the media of the current adjustment segment, preferably in units of the media timescale, in composition time. If this field is set to a predefined value, preferably -1, it is an empty edit. The track should preferably never end in an empty edit. Any difference between the duration in the movie header box and the track duration is preferably expressed as an implicit empty edit at the end. The media time 23 was determined in step S22 of Figure 4. The media_rate parameter 24 specifies the relative rate at which to play the media corresponding to this adjustment segment. If this value is equal to a predefined value, preferably 0, then the adjustment specifies a pause. This means that the media at media time 23 is presented for the segment duration 22. Otherwise, this field is set to another predefined value, preferably 1. The media rate 24 was determined in step S23 of Figure 4. The syntax of the track fragment adjustment information is compatible with the edit lists of the ISO-based media file format. In this way it is straightforward to transform track fragment adjustments into edit lists in the event that the track fragment media is not a simple continuation of the media in previous track fragments, but when the track fragments are sent after insertion of other fragments, such as advertising, splices, etc., or as a tune-in adjustment after a seek operation. In an alternative embodiment, explicit signaling is included, either in the track fragments themselves or elsewhere in the media container file, in order to indicate whether the track fragment adjustments are mandatory to interpret or when they can be conditionally ignored. The track fragment adjustment information is generally ignored when it is signaled as ignorable and under the condition of normal playback, that is, consecutive playback of track fragments without seeking, tuning in, etc.
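Under the version-0 syntax quoted above, a "tfma" box can be serialized and parsed with ordinary big-endian packing. The sketch below is an illustration of the layout only: the helper names are ours, and we assume the usual ISO box header of a 32-bit size followed by the four-character type.

```python
import struct

def build_tfma(entries, version=0):
    """Serialize a TrackFragmentMediaAdjustmentBox ('tfma') following the
    version-0 syntax: 32-bit segment_duration and media_time, followed by
    16-bit media_rate_integer and media_rate_fraction (always 0 here).
    `entries` is a list of (segment_duration, media_time,
    media_rate_integer) tuples."""
    # FullBox header fields: version (8 bits) + flags (24 bits, zero),
    # then the 32-bit entry_count.
    payload = struct.pack(">BxxxI", version, len(entries))
    for duration, media_time, rate_int in entries:
        payload += struct.pack(">Iihh", duration, media_time, rate_int, 0)
    # ISO box header: 32-bit size including the header, then the type.
    return struct.pack(">I4s", 8 + len(payload), b"tfma") + payload

def parse_tfma(box):
    """Inverse of build_tfma: recover the (duration, media_time, rate)
    entries from a serialized version-0 'tfma' box."""
    size, boxtype = struct.unpack(">I4s", box[:8])
    assert boxtype == b"tfma" and size == len(box)
    version, entry_count = struct.unpack(">BxxxI", box[8:16])
    entries, off = [], 16
    for _ in range(entry_count):
        duration, media_time, rate_int, _frac = struct.unpack(
            ">Iihh", box[off:off + 12])
        entries.append((duration, media_time, rate_int))
        off += 12
    return entries

segs = [(10, -1, 1), (40, 0, 1)]
assert parse_tfma(build_tfma(segs)) == segs
```

Note that media_time = -1 survives the round trip because the field is signed, matching the empty-edit convention described above.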
In this way it is possible to distinguish between the cases where the media fragment is a simple continuation in a sequence of consecutive fragments and where media adjustments are needed due to compositions, splices, advertisement insertion, tuning in, etc. In order to illustrate how the preferred parameters of the track fragment adjustment information can be used in a real implementation, a situation with HTTP streaming of video and audio data is assumed. In such a case, the moov box (see Figure 3) comprises a video track and an audio track, and each moof box (see Figure 3) in the subsequent media container files comprises a video traf and an audio traf. Suppose the video traf defines 60 s of video that must be played as: 1) wait 10 s before starting video rendering; 2) start the video and play for 40 s; 3) pause video rendering for 20 s (an intermission), for example to insert advertising; and 4) play the remaining 20 s of the video. The track fragment adjustment information for the video traf can then be defined as:

Entry count = 4

1st adjustment segment
Segment duration = 10 s
Media time = -1
Media rate = 1

Media time 23 here has the predefined value of -1, representing an empty edit and indicating that no rendering of the video should occur during the 10 s duration of the first adjustment segment.

2nd adjustment segment
Segment duration = 40 s
Media time = 0 s
Media rate = 1

Media time 23 is set to zero to indicate that the video rendering should start from the beginning of the video samples defined by the video traf.

3rd adjustment segment
Segment duration = 20 s
Media time = 40 s
Media rate = 0

Media rate 24 is here equal to zero to specify a pause, indicating a pause in video rendering for 20 s.

4th adjustment segment
Segment duration = 20 s
Media time = 40 s
Media rate = 1

Media rate 24 has now switched back to 1 to indicate that video rendering should continue.
In addition, media time 23 indicates that the video rendering should resume at the position where the rendering in the second adjustment segment ended. The track fragment adjustment information thus provides rendering synchronization relationships that allow synchronized rendering between the different media contents to be co-rendered. Track fragment adjustment information can, for example, define a delay in rendering the media content portion for a specified period of time by setting media time = -1 and the segment duration equal to the specified time period. It is also possible to define an advance in rendering the media content portion using another predefined media time value. Insertions of commercials and program splices are possible by providing track fragment adjustment information that defines a pause in the rendering of the media content, that is, setting media rate = 0. Figure 6 is a flowchart illustrating additional, optional steps of the method in Figure 1. The method continues from step S2 of Figure 1. A next step S30 retrieves a time stamp of the first media sample in the media content portion of the first media content and a time stamp of the first media sample in the corresponding media content portion of the second media content. A next step S32 calculates a time offset based on the time stamps retrieved in step S30, and this time offset is used in step S33 to determine the track fragment adjustment information. In a particular embodiment, step S32 involves finding the minimum or maximum value of the time stamps retrieved in step S30. A respective time offset is then calculated between each time stamp retrieved in step S30 and the minimum or maximum value. The calculated offsets are used in step S33 to configure the track fragment adjustment information. In one embodiment, the method also involves translating, in step S31, the time stamps (T1, T2, T3, ...)
retrieved in step S30 to a common timescale, such as the timescale in the movie header box, to obtain translated time stamps (T1', T2', T3', ...). The reason for this translation of the time stamps is that the timescales of the retrieved time stamps are very unlikely to be equal, due to different sample rates for the media contents, such as 30 frames per second for video and 50 frames per second for audio. Step S32 then involves finding the minimum value of the translated time stamps, TMIN = MIN(T1', T2', T3', ...), or the maximum value of the translated time stamps, TMAX = MAX(T1', T2', T3', ...). The offsets are then calculated as: Offset1 = T1' - TMIN or Offset1 = TMAX - T1', Offset2 = T2' - TMIN or Offset2 = TMAX - T2', and so on. With this approach, different track fragments with different start times can be perfectly aligned. In addition, the self-integrity of the track fragments inside the traf box is preserved, since all adjustments can be defined relative to the track fragment that has the minimum or maximum time stamp value inside the traf box. Generally, video will not be synchronized with audio in most cases, and the misalignment can be even greater when capture devices and encoding restrictions are considered. For example, the audio data could be fragmented at a point in time at which the corresponding video sample was a B picture that cannot be randomly accessed. In such a case, the video track fragment has no choice but to step to a picture in the forward or backward direction. This can result in some tracks being noticeably longer or shorter. In such a scenario, even these tracks can be aligned with the help of the track fragment adjustment information, preferably included in a track fragment media adjustment box. Tune-in can also be troublesome when not all tracks have content at first, for example a 3D movie fragment with only the left view at the beginning.
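The translation and offset computation of steps S30-S32 can be made concrete with a small numeric sketch (the values and the helper name are illustrative): each track's first timestamp is rescaled to a common timescale, and the per-track offset is taken against the minimum.

```python
from fractions import Fraction

def track_offsets(first_timestamps, common_timescale):
    """first_timestamps: list of (timestamp, track_timescale) pairs for
    the first media sample of each track fragment. Translates each
    timestamp to the common timescale (step S31), finds the minimum
    (step S32), and returns the per-track offsets against that minimum,
    in common-timescale ticks."""
    translated = [Fraction(ts, scale) * common_timescale
                  for ts, scale in first_timestamps]
    t_min = min(translated)
    return [t - t_min for t in translated]

# Video sampled at 30 frames per second (timescale 30), audio frames at
# 50 per second (timescale 50), movie timescale 600:
offsets = track_offsets([(3, 30), (4, 50)], 600)
print([int(o) for o in offsets])  # [12, 0]: video starts 12 ticks later
```

Exact rational arithmetic is used so that unequal timescales translate without rounding error.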
It would be desirable for the missing content on the short tracks to be spliced in from the data in the preceding track fragment, or for extra content on the long tracks to be trimmed. This can in principle be done given the offset information provided by the track fragment media adjustment box, that is, by the track fragment adjustment information. The spliced, complete track fragment helps to improve the user experience when tuning in to a fragment and is also useful in many other scenarios, since it now constitutes a complete presentation; for example, a media fragment can be transcoded into a standalone movie. Track fragment adjustment information can also be used to handle overlapping track fragments, typically movie fragments. This scenario is illustrated in Figures 7 and 8. In such a case, the track fragment adjustment information is preferably always used by the user terminals, and not only during seek or tune-in operations. The reason for having overlapping fragments may be that some encoders need a certain start-up time before the quality reaches the desired level, see Figure 7. In this embodiment, the start of a fragment overlaps with the previous fragment, see Figure 8. The track fragment adjustment information can then be used to indicate the start-up phase and to signal from which point the normal playback phase starts. Figure 9 is a schematic block diagram of a device 100 for providing media content for streaming. Device 100 comprises a track provider 120 configured to provide a respective initial track for each media content of at least a first media content and a second media content. The initial tracks define the respective media contents and are provided in a media container file by track provider 120. A fragment provider 130 is implemented in device 100 and is configured to provide at least one respective track fragment for each of the at least two media contents.
These track fragments define the decoding and rendering instructions for different portions of the first media content or the second media content. In a particular embodiment, fragment provider 130 provides at least one, preferably exactly one, such track fragment for the first media content and at least one, preferably one, track fragment for the second media content per subsequent media container file. Device 100 also comprises an information provider 140 configured to provide track fragment adjustment information in at least one track fragment associated with the first media content. The track fragment adjustment information is then applicable to the media content portion of the first media content defined by the track fragment. In a particular embodiment, first track fragment adjustment information is provided in a first track fragment defining decoding and rendering instructions for a first media content portion of the first media content. Second track fragment adjustment information is similarly provided by information provider 140 in a second track fragment defining decoding and rendering instructions for a second media content portion of the second media content. In such a case, the first and second track fragments, and thus the first and second track fragment adjustment information, are advantageously provided in the same subsequent media container file. The track fragment adjustment information defines rendering synchronization relationships between the first media content portion of the first media content and the second media content portion of the second media content. The first and second media content portions are further intended to be co-rendered during a media presentation. The device 100 for providing the media content preferably also comprises a transmitter 110 or general output unit for transmitting the media container file to a requesting user terminal.
Transmitter 110 is preferably configured to continuously stream subsequent media container files to the user terminal, and more preferably to continuously stream the subsequent media container files using HTTP as streaming protocol. Information provider 140 of device 100 can be configured to provide track fragment adjustment information in all track fragments. In an alternative embodiment, information provider 140 provides track fragment adjustment information in at least one track fragment, preferably all track fragments, of a subsequent media container file that will be the first subsequent media container file received by a user terminal tuning in to the stream. In such a case, device 100 preferably comprises a receiver 110 or general input unit, such as implemented as a transceiver or a common input and output unit, for receiving a random access request from the user terminal. The information provider 140 then provides the track fragment adjustment information in the track fragment(s) in response to the received random access request. Units 110 to 140 of device 100 can be implemented or provided as hardware or a combination of hardware and software. In the case of a software-based implementation, a computer program product implementing device 100 or a portion thereof comprises software or a computer program running on a general purpose or specially adapted computer, processor or microprocessor. The software includes the computer program code elements or software code portions illustrated in Figure 9. The program can be stored in whole or in part on one or more appropriate non-transitory computer-readable media or data storage media, such as magnetic disks, CD-ROMs, DVD disks, USB memories, hard disks, magneto-optical memory, in RAM or volatile memory, in ROM or flash memory, as firmware, or on a data server.
The device 100 can advantageously be implemented in or connected to a media server present in or connected to a network interconnection point or base station of a radio-based communication network. Figure 10 is a schematic illustration of an embodiment of information provider 140 of the device of Figure 9. Information provider 140 comprises a number definer 141 configured to define a number of adjustment segments for the relevant media content portion defined by a track fragment into which the track fragment adjustment information will be inserted. A duration determiner 142 is implemented to determine a respective time duration for each adjustment segment. The information provider 140 also comprises a time indication determiner 143 configured to determine a respective time indication for each adjustment segment. The time indication represents a start time of the adjustment segment within the relevant media content portion, or represents a so-called empty time. In the latter case, the empty time indicates that no rendering of the media content should occur for the duration of the adjustment segment. An information definer 145 generates the track fragment adjustment information to comprise the number of adjustment segments defined by the number definer 141, the time durations determined by the duration determiner 142 and the time indications determined by the time indication determiner 143. In an optional embodiment, the information provider 140 also comprises a rate indication determiner 144. The rate indication determiner 144 is configured to determine a respective rate indication for each adjustment segment. The rate indication represents a relative rendering rate for the media content portion corresponding to the adjustment segment. For example, the rate indication can indicate a pause in rendering or define that rendering should proceed normally.
The information definer 145 is in this embodiment configured to generate the track fragment adjustment information to comprise the rate indications determined by the rate indication determiner 144 in addition to the number of adjustment segments, the time durations and the time indications. Units 141 to 145 of information provider 140 can be implemented or provided as hardware or a combination of hardware and software. In the case of a software-based implementation, a computer program product implementing information provider 140 or a part thereof comprises software or a computer program running on a general purpose or specially adapted computer, processor or microprocessor. The software includes the computer program code elements or software code portions illustrated in Figure 10. The program can be stored in whole or in part on one or more appropriate non-transitory computer-readable media or data storage media, such as magnetic disks, CD-ROMs, DVD disks, USB memories, hard disks, magneto-optical memory, in RAM or volatile memory, in ROM or flash memory, as firmware, or on a data server. Figure 11 is a schematic block diagram of another embodiment of the device 100 for providing media content for streaming. In addition to the units described above in connection with Figure 9, this embodiment of device 100 comprises a time stamp retriever 150 configured to retrieve a time stamp of the first media sample in the first media content portion of the first media content and of the first media sample in the second media content portion of the second media content. An offset calculator 160 is implemented to calculate a time offset based on the retrieved time stamps.
The device 100 then comprises an information determiner 180 configured to determine the track fragment adjustment information based on the time offset calculated by the offset calculator 160. In a particular embodiment, the time stamp retriever 150 is configured to retrieve respective time stamps of the first media samples for each media content present in a subsequent media container file. An optional value finder 190 is configured to identify the minimum or maximum value of the retrieved time stamps. The offset calculator 160 calculates a respective time offset for each media content as the difference between the minimum or maximum value from the value finder 190 and the time stamp retrieved by the time stamp retriever 150 for the media content. The information determiner 180 then determines the track fragment adjustment information based on the calculated time offsets. In a particular embodiment, an optional time stamp translator 170 first translates the time stamps retrieved by the time stamp retriever 150 to a common timescale to form corresponding translated time stamps. The calculation of the time offsets by the offset calculator 160 can then be conducted based on the translated time stamps. In one embodiment, the value finder 190 then finds the minimum or maximum value of the translated time stamps and the offset calculator 160 calculates the time offsets between the minimum or maximum value and the respective translated time stamps. Units 110 to 190 of device 100 can be implemented or provided as hardware or a combination of hardware and software. In the case of a software-based implementation, a computer program product implementing device 100 or a portion thereof comprises software or a computer program running on a general purpose or specially adapted computer, processor or microprocessor. The software includes the computer program code elements or software code portions illustrated in Figure 11.
The program can be stored in whole or in part on one or more appropriate non-transitory computer-readable media or data storage media, such as magnetic disks, CD-ROMs, DVD disks, USB memories, hard disks, magneto-optical memory, in RAM or volatile memory, in ROM or flash memory, as firmware, or on a data server. Device 100 can advantageously be implemented in or connected to a media server present in or connected to a network interconnection point or base station of a radio-based communication network. Figure 12 is a flowchart illustrating a method of rendering media content based on track fragment adjustment information. The method starts at step S40, in which a media container file is requested and received. This media container file is usually a first or initial media container file and comprises the initial tracks defining at least first and second media contents to be co-rendered in a media presentation. The request can advantageously be a random access request for media content streamed over HTTP. A next step S41 receives a subsequent media container file. This subsequent media container file is preferably an intermediate media container file in a continuous stream of multiple media container files. Thus, the reception of the subsequent media container file in step S41 is typically due to a tune-in or seek operation in the streamed media. The subsequent media container file comprises a track fragment for each media content, that is, for at least the first and second media contents. The track fragment for a particular media content comprises information defining decoding and rendering instructions applicable to a portion of the particular media content. In addition, the track fragment also comprises track fragment adjustment information according to the embodiments.
In a particular embodiment, only one of the track fragments in the subsequent media container file, such as the track fragment associated with the first media content, comprises track fragment adjustment information. Another embodiment provides track fragment adjustment information in multiple, typically all, track fragments in the subsequent media container file. The next step S42 co-renders the respective portions of the first and second media contents defined by the track fragments in the subsequent media container file. This co-rendering in step S42 is performed based on the track fragments and based on the track fragment adjustment information that defines the rendering synchronization relationships between the respective media content portions. The track fragment adjustment information thus allows correct time alignment of the respective media content portions to obtain synchronized co-rendering, where, for example, audio data will be rendered synchronously with video data, or left-view and right-view video data will be synchronized and correctly time-aligned to achieve 3D or stereo video, preferably together with synchronized audio playback. In a continuous media session, further subsequent media container files are typically received, which is illustrated schematically in Figure 14. The method continues from step S42 of Figure 12. A next step S60 receives a second subsequent media container file. If it comprises track fragment adjustment information, that information can be ignored, and synchronized co-rendering of the media contents is still achieved. This is possible because the correct time alignment of the track fragments and media content portions in the subsequent media container file h+1 can be determined based on the durations or lengths of the track fragments and media content portions in the subsequent media container file h, if those were correctly aligned in time.
Track fragment adjustment information thus becomes important in particular for obtaining synchronized rendering when tuning in to or randomly accessing a continuous stream of subsequent media container files, where no previous media container files have been received. Step S61 thus involves co-rendering the media content portions defined by the media tracks in the second subsequent media container file during the media presentation, where the co-rendering is performed regardless of any track fragment adjustment information contained in the second subsequent media container file. The method then ends or continues to step S60 to receive another subsequent media container file. Figure 13 is a flowchart illustrating an embodiment of the co-rendering step of Figure 12. The method continues from step S41 of Figure 12. A next step S50 retrieves an indication of the number of adjustment segments for a track fragment, and thus for the media content portion with which the track fragment is associated. This indication, denoted EC for entry count in Figure 13, is retrieved in step S50 from the track fragment adjustment information in the track fragment. A next step S51 sets a counter k equal to one to indicate the first adjustment segment. A next step S52 checks whether all adjustment segments have been processed, that is, whether counter k has reached the number retrieved in step S50 (k = EC). If the last adjustment segment has been reached, the method ends or continues to process a next track fragment. However, it may be preferred to process multiple track fragments in a media container file in parallel. In such a case, the steps of the method in Figure 13 are conducted basically in parallel for the track fragments in the subsequent media container file. If serial processing is performed instead, the method continues from step S52 to step S50, but with respect to the next track fragment in the subsequent media container file.
The next step S53 retrieves a time indication for the current adjustment segment number k and checks whether the time indication has a predefined value representing a so-called empty edit. In the figure, this time indication is represented by MT(k), denoting the media time for adjustment segment number k. The predefined value is preferably -1. If MT(k) = -1, the method continues to step S54, in which the rendering of the media content associated with the track fragment adjustment information is delayed for a period of time corresponding to the duration of the current adjustment segment number k. This duration is defined by a time duration retrieved from the track fragment adjustment information in step S54. The time duration is represented by SD(k) in the figure, denoting the segment duration for adjustment segment number k. If the time indication does not have the predefined value of -1, the method continues from step S53. In an optional embodiment, the track fragment adjustment information comprises a rate indication. The rate indication is represented by MR(k) in the figure, denoting the media rate for adjustment segment number k. In such a case, the optional step S55 retrieves the rate indication for the current adjustment segment from the track fragment adjustment information and checks whether it has a predefined value equal to 0, indicating a so-called pause. If MR(k) = 0, the method continues from step S55 to optional step S56, in which the rendering of the relevant media content is paused for the duration of the current adjustment segment, as defined based on the time duration SD(k). However, if the rate indication is not equal to the predefined value of 0, the method continues to step S57, in which the media content is rendered for a period of time corresponding to the duration of the current adjustment segment SD(k).
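The per-segment loop of Figure 13, applied to the 60-second video example given earlier, can be sketched as follows. The player callback interface is hypothetical; only the MT(k)/SD(k)/MR(k) decision logic comes from the described steps.

```python
def process_track_fragment(segments, player):
    """Execute the loop of Figure 13 for one track fragment. `segments`
    holds the adjustment entries as (SD, MT, MR) tuples: segment
    duration, media time and media rate. `player` is any object with
    delay(duration), pause(duration) and render(duration, media_time)
    methods (an assumed callback interface, not from the text)."""
    EC = len(segments)             # entry count retrieved in step S50
    k = 1                          # counter initialized in step S51
    while k <= EC:                 # step S52: segments left to process?
        SD, MT, MR = segments[k - 1]
        if MT == -1:               # step S53: empty edit?
            player.delay(SD)       # step S54: delay rendering
        elif MR == 0:              # step S55: pause indicated?
            player.pause(SD)       # step S56: pause rendering
        else:
            player.render(SD, MT)  # step S57: render the media content
        k += 1                     # step S58: next adjustment segment

class LoggingPlayer:
    """Records the actions instead of rendering, for illustration."""
    def __init__(self): self.log = []
    def delay(self, d): self.log.append(("delay", d))
    def pause(self, d): self.log.append(("pause", d))
    def render(self, d, mt): self.log.append(("render", d, mt))

# The four adjustment segments of the video-traf example above:
p = LoggingPlayer()
process_track_fragment([(10, -1, 1), (40, 0, 1), (20, 40, 0), (20, 40, 1)], p)
print(p.log)
# [('delay', 10), ('render', 40, 0), ('pause', 20), ('render', 20, 40)]
```

The 90 s presentation (10 s delay, 40 s playback, 20 s pause, 20 s playback) thus falls out mechanically from the adjustment entries.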
The method then continues from step S54, S56 or S57 to step S58, where counter k is incremented by one to indicate that the next adjustment segment is to be processed. The method then continues to step S52. The time indication MT(k) of the track fragment adjustment information is thus used to select whether to render the media content corresponding to the adjustment segment for the duration of time defined by SD(k) or to delay the rendering of the media content for a period of time corresponding to SD(k). In another embodiment, the time indication MT(k) can be used to select whether to render the media content, delay the rendering of the media content, or advance the rendering of the media content for a period of time defined by SD(k). Correspondingly, the rate indication MR(k) of the track fragment adjustment information can be used to select whether to render the media content corresponding to adjustment segment number k for a period of time defined by SD(k) or to pause the rendering of the media content for a period of time corresponding to SD(k). Figure 15 is a schematic block diagram of a user terminal 200 according to an embodiment. User terminal 200 is exemplified by a mobile terminal in the figure. However, this should merely be seen as an illustrative example. The user terminal could be any entity or device, or aggregation of multiple devices, that has decoding and rendering capabilities. A single such device can be a mobile terminal, such as a mobile phone or laptop, a computer, a set-top box for a television or any other media processing device. The decoding and rendering functionality can also be present on different devices that are then able to conduct wired or wireless communication with each other. The user terminal embodiments therefore also encompass distributed implementations of the media terminal functionality. User terminal 200 comprises a transmitter 210 configured to transmit a request for a media container file.
The request could be a random access request, a tuning request or a seek request to access streaming media content that is optionally streamed using HTTP. The media container file is an initial media container file as previously discussed and comprises initial tracks, where the media content defined by the initial tracks is configured to be co-rendered by user terminal 200 in a media presentation. User terminal 200 also comprises a receiver 210 configured to receive the requested initial media container file and to also receive a subsequent media container file. The receiver and transmitter can be implemented as separate devices or as a common transceiver 210. The subsequent media container file comprises a respective track fragment for each media content, defining respective decoding and rendering instructions for the portions of media content associated with the subsequent media container file. A media player 220 is implemented at user terminal 200 to co-render the portions of media content defined in the subsequent media container file received by the receiver 210. This co-rendering by the media player is performed based on the instructions defined in the track fragments and based on rendering synchronization relationships defined by the track fragment adjustment information contained in one or more track fragments of the subsequent media container file. The co-rendered media data could then be displayed on a display screen 230 of or connected to user terminal 200 and played back on a speaker (not shown) of or connected to user terminal 200. User terminal 200 preferably also comprises a decoder (not shown), which can form part of media player 220 and is configured to decode the media content based on the track fragments in the subsequent media container file. In one embodiment, user terminal 200 comprises an optional metadata retriever 250.
Metadata retriever 250 is configured to retrieve, from the track fragment adjustment information, an indication of the number of adjustment segments in the relevant media content portion, the time duration for each adjustment segment and a respective time indication for each adjustment segment. An optional player controller 240 is implemented to control the operation of the media player 220 based on the time indication of a current adjustment segment. The player controller 240 then controls the media player 220 to render the media content corresponding to the adjustment segment for a period of time defined by the duration of the adjustment segment, to advance rendering for a period of time defined by the time duration, or to delay rendering for a period of time defined by the time duration, depending on the value of the time indication. In another embodiment, the metadata retriever 250 additionally retrieves a rate indication for each adjustment segment from the track fragment adjustment information. In such a case, the player controller additionally controls the media player 220 based on the rate indication in addition to the time indication. The rate indication can then be used to control the media player 220 to render the media content corresponding to the adjustment segment for a period of time defined by the duration of the adjustment segment or to control the media player 220 to pause rendering for a period of time defined by the time duration. Track fragment adjustment information is advantageously used only in connection with the receipt of the first subsequent media container file. Thus, when the receiver 210 receives a second and further subsequent media container files from the stream, these subsequent media container files have track fragments for the media content and optionally also track fragment adjustment information.
However, the media player 220 then co-renders the portions of the media content defined by the second and further subsequent media container files based on the track fragments in these subsequent media container files, but regardless of any track fragment adjustment information contained in the subsequent media container files. Units 210 to 250 of user terminal 200 can be implemented or provided as hardware or a combination of hardware and software. In the case of a software-based implementation, a computer program product implementing user terminal 200 or a portion thereof comprises software or a computer program run on a general purpose or specially adapted computer, processor or microprocessor. The software includes computer program code elements or software code portions making up the units illustrated in Figure 15. The program may be stored in whole or in part on one or more suitable non-transitory computer-readable media or data storage means, such as magnetic disks, CD-ROMs, DVD disks, USB memories, hard disks, magneto-optical memory, RAM or volatile memory, ROM or flash memory, as firmware, or on a data server. The embodiments described above are to be understood as some illustrative examples of the present invention. It will be understood by those skilled in the art that various modifications, combinations and changes can be made to the embodiments without departing from the scope of the present invention. In particular, part solutions in the different embodiments can be combined in other configurations, where technically possible. The scope of the present invention is, however, defined by the appended claims.
Claims (19) [1] 1. Method of providing media content for streaming, said method characterized by the fact that it comprises: providing (S1), in a media container file (1) formatted according to the ISO base media file format and for each media content of at least a first media content and a second media content, an initial track (5, 6) comprising an identification of said media content and information of an encoder/decoder employed to encode said media content; providing (S2), for each media content of said at least first media content and second media content, at least one track fragment (15, 16) comprising information defining decoding and rendering instructions for a portion of media content of said media content; determining (S21), for each adjustment segment of a number of adjustment segments defined in a portion of media content of said first media content defined by said track fragment (15), a time duration (22) of said adjustment segment; determining (S22), for each adjustment segment of said number of adjustment segments, a time indication (23) representing a start time of said adjustment segment within said portion of media content of said first media content or representing an empty time indicating that no rendering of said first media content is to occur during the duration of said adjustment segment; and providing (S3), in a track fragment (15) associated with said first media content, track fragment adjustment information (20) applicable to said portion of media content of said first media content and comprising said number (21) of adjustment segments, and said time duration (22) and said time indication (23) for each adjustment segment, wherein said track fragment adjustment information (20) defines rendering synchronization relationships between said portion of media content of said first media content and the corresponding portion of media content of said second media content to be co-rendered with said portion of
media content of said first media content during a media presentation. [2] 2. Method according to claim 1, characterized by the fact that providing (S3) said track fragment adjustment information (20) further comprises determining (S23), for each adjustment segment of said number of adjustment segments, a rate indication (24) representing a relative rendering rate for the media content corresponding to said adjustment segment, which has a predefined value if said media content corresponding to said adjustment segment is to be presented for the duration of said adjustment segment and otherwise has another predefined value, wherein defining (S24) said track fragment adjustment information (20) comprises defining (S24) said track fragment adjustment information (20) to comprise said number (21) of adjustment segments and said time duration (22), said time indication (23) and said rate indication (24) for each adjustment segment. [3] 3. Method according to claim 1 or 2, characterized by the fact that it further comprises: retrieving (S30) a time stamp of a first media sample in said portion of media content of said first media content and a time stamp of a first media sample in said portion of corresponding media content of said second media content; calculating (S32) a time offset based on said time stamp of said first media sample in said portion of media content of said first media content and said time stamp of said first media sample in said portion of corresponding media content of said second media content; and determining (S33) said time indication (23) of said track fragment adjustment information (20) based on said time offset. [4] 4.
Device (100) for providing media content for streaming, characterized by the fact that said device (100) comprises: a track provider (120) configured to provide, in a media container file (1) formatted according to the ISO base media file format and for each media content of at least a first media content and a second media content, an initial track (5, 6) comprising an identification of said media content and information of an encoder/decoder used to encode said media content; a fragment provider (130) configured to provide, for each media content of said at least first media content and second media content, at least one track fragment (15, 16) comprising information defining decoding and rendering instructions for a portion of media content of said media content; and an information provider (140) configured to provide, in a track fragment (15) associated with said first media content, track fragment adjustment information (20) applicable to a portion of media content of said first media content defined by said track fragment (15), wherein said information provider (140) comprises: a number definer (141) configured to define a number (21) of adjustment segments in said portion of media content of said first media content; a duration determiner (142) configured to determine, for each adjustment segment of said number of adjustment segments, a time duration (22) of said adjustment segment; a time indication determiner (143) configured to determine, for each adjustment segment of said number of adjustment segments, a time indication (23) representing a start time of said adjustment segment within said portion of media content of said first media content or representing an empty time indicating that no rendering of said first media content is to occur during the duration of said adjustment segment; and an information definer (145) configured to define said track fragment adjustment information (20) to comprise said number (21) of adjustment
segments, and said time duration (22) and said time indication (23) for each adjustment segment, wherein said track fragment adjustment information (20) defines rendering synchronization relationships between said portion of media content of said first media content and the corresponding portion of media content of said second media content to be co-rendered with said portion of media content of said first media content during a media presentation. [5] 5. Device according to claim 4, characterized by the fact that said fragment provider (130) is configured to provide, for each media content of said at least first media content and second media content, said at least one track fragment (15, 16) contained in a subsequent media container file (11) formatted according to the ISO base media file format. [6] 6. Device according to claim 5, characterized by the fact that it further comprises a transmitter (110) configured to transmit said media container file (1) to a requesting user terminal (200) and to continuously transmit said subsequent media container file (11) to said user terminal (200) using the hypertext transfer protocol, HTTP. [7] 7.
Device according to any one of claims 4 to 6, characterized by the fact that said information provider (140) comprises a rate indication determiner (144) configured to determine, for each adjustment segment of said number of adjustment segments, a rate indication (24) representing a relative rendering rate for the media content corresponding to said adjustment segment, which has a predefined value if said media content corresponding to said adjustment segment is to be presented for the duration of said adjustment segment and otherwise has another predefined value, wherein said information definer (145) is configured to define said track fragment adjustment information (20) to comprise said number (21) of adjustment segments and said time duration (22), said time indication (23) and said rate indication (24) for each adjustment segment. [8] 8. Device according to any one of claims 4 to 7, characterized by the fact that it further comprises: a time stamp retriever (150) configured to retrieve a time stamp of a first media sample in said portion of media content of said first media content and a time stamp of a first media sample in said portion of corresponding media content of said second media content; an offset calculator (160) configured to calculate a time offset based on said time stamp of said first media sample in said portion of media content of said first media content and said time stamp of said first media sample in said portion of corresponding media content of said second media content; and an information determiner (180) configured to determine said time indication (23) of said track fragment adjustment information based on said time offset. [9] 9. Device according to claim 8, characterized by the fact that it further comprises a time stamp translator (170) configured to translate said time stamp of said first media sample in said portion of media content of said first media content, having a first sampling rate, and said time stamp of said first media sample in said portion of corresponding media content of said second media content, having a different second sampling rate, to a common timescale to form a first translated time stamp and a second translated time stamp, wherein said offset calculator (160) is configured to calculate said time offset as a difference between said first translated time stamp and said second translated time stamp. [10] 10. Device according to claim 9, characterized by the fact that it further comprises a value locator (190) configured to find a minimum or a maximum value of said first translated time stamp and said second translated time stamp, wherein said offset calculator (160) is configured to calculate a first time offset between said minimum value or said maximum value and said first translated time stamp and to calculate a second time offset between said minimum value or said maximum value and said second translated time stamp; and said information determiner (180) is configured to determine said track fragment adjustment information based on said first time offset and said second time offset. [11] 11. Device according to any one of claims 4 to 10, characterized by the fact that said information provider (140) is configured to provide, in said at least one track fragment (15), said track fragment adjustment information (20) in response to the receipt of a random access request by a receiver (110) of said device (100). [12] 12.
Device according to any one of claims 5 to 11, characterized by the fact that said information provider (140) is configured to i) provide, in said track fragment (15) associated with said first media content, said track fragment adjustment information (20) applicable to said portion of media content of said first media content, and ii) provide, in a track fragment (16) associated with said second media content, track fragment adjustment information applicable to said portion of corresponding media content of said second media content, wherein said track fragment adjustment information (20) defines rendering synchronization relationships between said portion of media content of said first media content and said portion of corresponding media content of said second media content. [13] 13. Media content rendering method, characterized by the fact that it comprises: requesting and receiving (S40) a media container file (1) formatted according to the ISO base media file format and comprising, for each media content of at least a first media content and a second media content, an initial track (5, 6) comprising an identifier of said media content and information of an encoder/decoder employed to encode said media content; receiving (S41) a subsequent media container file (11) formatted according to the ISO base media file format and comprising, for each media content of said at least first media content and said second media content, a track fragment (15, 16) comprising information defining
decoding and rendering instructions for a portion of media content of said media content, and track fragment adjustment information (20) contained in a track fragment (15) associated with said first media content and applicable to a portion of media content of said first media content; retrieving (S50), from said track fragment adjustment information (20), an indication (21) of the number of adjustment segments in said portion of media content of said first media content, a time duration (22) for each adjustment segment of said number of adjustment segments and a time indication (23), for each adjustment segment of said number of adjustment segments, representing a start time of said adjustment segment within said portion of media content of said first media content or representing an empty time indicating that no rendering of said first media content is to occur during the duration of said adjustment segment; and co-rendering (S42) said portion of media content of said first media content and a portion of corresponding media content of said second media content during a media presentation based on said track fragments (15, 16) and based on said track fragment adjustment information (20) defining rendering synchronization relationships between said portion of media content of said first media content and said portion of corresponding media content of said second media content, by selecting (S53), for each adjustment segment of said number of adjustment segments and based on said time indication (23) of said adjustment segment, either to render (S57) the media content corresponding to said adjustment segment of said first media content in said portion of media content for a period of time defined by the time duration (22) of said adjustment segment, to advance rendering for a period of time defined by said time duration (22) of said adjustment segment, or to delay (S54) rendering for a period of time defined by said time duration (22) of said adjustment
segment. [14] 14. Method according to claim 13, characterized by the fact that retrieving (S50) from said track fragment adjustment information (20) comprises retrieving (S50), from said track fragment adjustment information (20), said indication (21) of the number of adjustment segments in said portion of media content of said first media content, said time duration (22) for each adjustment segment of said number of adjustment segments, said time indication (23) for each adjustment segment of said number of adjustment segments and a rate indication (24) for each adjustment segment of said number of adjustment segments, wherein co-rendering (S42) said portion of media content and said portion of corresponding media content comprises selecting (S55), for each adjustment segment of said number of adjustment segments and based on said rate indication (24) of said adjustment segment, either to render (S57) the media content corresponding to said adjustment segment in said portion of media content of said first media content for a period of time defined by said time duration (22) of said adjustment segment or to pause (S56) rendering for a period of time defined by said time duration (22) of said adjustment segment. [15] 15.
Method according to claim 13 or 14, characterized by the fact that it further comprises: receiving (S60) a second subsequent media container file formatted according to the ISO base media file format and comprising, for each media content of said at least first media content and said second media content, a track fragment comprising information defining decoding and rendering instructions for a second portion of media content of said media content and track fragment adjustment information contained in a track fragment associated with said first media content and applicable to a second portion of media content of said first media content; and co-rendering (S61) said second portion of media content of said first media content and a second portion of corresponding media content of said second media content during said media presentation based on said track fragments contained in said second subsequent media container file but regardless of said track fragment adjustment information contained in said second subsequent media container file. [16] 16.
User terminal (200), characterized by the fact that it comprises: a transmitter (210) configured to transmit a request for a media container file (1) formatted according to the ISO base media file format and comprising, for each media content of at least a first media content and a second media content, an initial track (5, 6) comprising an identifier of said media content and information of an encoder/decoder employed to encode said media content; a receiver (210) configured to receive said media container file (1) and a subsequent media container file (11) formatted according to the ISO base media file format and comprising, for each media content of said at least first media content and said second media content, a track fragment (15, 16) comprising information defining decoding and rendering instructions for a portion of media content of said media content and track fragment adjustment information (20) contained in a track fragment (15) associated with said first media content and applicable to a portion of media content of said first media content; a metadata retriever (250) configured to retrieve, from said track fragment adjustment information (20), an indication (21) of the number of adjustment segments in said portion of media content of said first media content, a time duration (22) for each adjustment segment of said number of adjustment segments and a time indication (23), for each adjustment segment of said number of adjustment segments, representing a start time of said adjustment segment within said portion of media content of said first media content or representing an empty time indicating that no rendering of said first media content is to occur during the duration of said adjustment segment; a media player (220) configured to co-render said portion of media content of said first media content and a portion of corresponding media content of said second media content during a media presentation based on said track
fragments (15, 16) and based on said track fragment adjustment information (20) defining rendering synchronization relationships between said portion of media content of said first media content and said portion of corresponding media content of said second media content; and a player controller (240) configured to, for each adjustment segment of said number of adjustment segments and based on said time indication (23) of said adjustment segment, control said media player (220) to render the media content corresponding to said adjustment segment of said first media content in said portion of media content for a period of time defined by the time duration (22) of said adjustment segment, to control said media player (220) to advance rendering for a period of time defined by said time duration (22) of said adjustment segment or to control said media player (220) to delay rendering for a period of time defined by said time duration (22) of said adjustment segment. [17] 17. User terminal according to claim 16, characterized by the fact that said metadata retriever (250) is configured to retrieve, from said track fragment adjustment information (20), said indication (21) of the number of adjustment segments in said portion of media content of said first media content, said time duration (22) for each adjustment segment of said number of adjustment segments, said time indication (23) for each adjustment segment of said number of adjustment segments and a rate indication (24) for each adjustment segment of said number of adjustment segments, and said player controller (240) is configured to, for each adjustment segment of said number of adjustment segments and based on said rate indication (24) of said adjustment segment, control said media player (220) to render the media content corresponding to said adjustment segment in said portion of media content of said first media content for a period of time defined by said time duration (22) of said
adjustment segment or to control said media player (220) to pause rendering for a period of time defined by said time duration (22) of said adjustment segment. [18] 18. User terminal according to claim 16 or 17, characterized by the fact that said receiver (210) is configured to receive a second subsequent media container file formatted according to the ISO base media file format and comprising, for each media content of said at least first media content and said second media content, a track fragment comprising information defining decoding and rendering instructions for a second portion of media content of said media content and track fragment adjustment information contained in a track fragment associated with said first media content and applicable to a second portion of media content of said first media content, and said media player (220) is configured to co-render said second portion of media content of said first media content and a second portion of corresponding media content of said second media content during said media presentation based on said track fragments contained in said second subsequent media container file but regardless of said track fragment adjustment information contained in said second subsequent media container file. [19] 19. User terminal according to any one of claims 16 to 18, characterized by the fact that said transmitter (210) is configured to request said media container file (1) by transmitting a random access request for media content continuously transmitted over the hypertext transfer protocol.
Similar technologies:
公开号 | 公开日 | 专利标题 BR112012010772A2|2020-09-08|method and device for providing streaming media content, media content rendering method, and user terminal US9253233B2|2016-02-02|Switch signaling methods providing improved switching between representations for adaptive HTTP streaming CA2807157C|2016-04-19|Manifest file updates for network streaming of coded video data ES2870552T3|2021-10-27|Procedure and apparatus for transmitting and receiving content based on adaptive Streaming mechanism US20110307545A1|2011-12-15|Apparatus and Methods for Describing and Timing Representatives in Streaming Media Files BR112012001150B1|2021-06-29|METHOD FOR IMPLEMENTING HTTP-BASED TRANSMISSION SERVICE KR20150012206A|2015-02-03|Method and apparatus for encoding three dimensoional content CN102625193A|2012-08-01|A method of realizing multimedia file network playing by virtue of auxiliary files BR112020015214A2|2021-01-26|dynamic conditional ad insertion BR112020014495A2|2020-12-01|dynamic network content processing of an iso bmff network resource range BR112013002692B1|2021-10-26|METHOD AND DEVICE TO RETRIEVE MULTIMEDIA DATA, METHOD AND DEVICE TO SEND INFORMATION TO MULTIMEDIA DATA, AND COMPUTER-READABLE MEMORY
Patent family:
Publication number | Publication date WO2011056139A1|2011-05-12| US20120221741A1|2012-08-30| US20140186003A1|2014-07-03| EP2497269A1|2012-09-12| US8635359B2|2014-01-21| US9653113B2|2017-05-16|
Cited references:
公开号 | 申请日 | 公开日 | 申请人 | 专利标题 US20040006575A1|2002-04-29|2004-01-08|Visharam Mohammed Zubair|Method and apparatus for supporting advanced coding formats in media files| US20040146285A1|2002-05-28|2004-07-29|Yoshinori Matsui|Moving picture data reproducing device with improved random access| US9380096B2|2006-06-09|2016-06-28|Qualcomm Incorporated|Enhanced block-request streaming system for handling low-latency streaming| KR101074585B1|2007-07-02|2011-10-17|프라운호퍼-게젤샤프트 추르 푀르데룽 데어 안제반텐 포르슝 에 파우|Apparatus and Method for Storing and Reading a File having a Media Data Container and a Metadata Container| CN101802823A|2007-08-20|2010-08-11|诺基亚公司|Segmented metadata and indexes for streamed multimedia data| US8775566B2|2008-06-21|2014-07-08|Microsoft Corporation|File format for media distribution and presentation| US8904191B2|2009-01-21|2014-12-02|Microsoft Corporation|Multiple content protection systems in a file| WO2010117316A1|2009-04-09|2010-10-14|Telefonaktiebolaget L M Ericsson |Methods and arrangements for creating and handling media files| US9917874B2|2009-09-22|2018-03-13|Qualcomm Incorporated|Enhanced block-request streaming using block partitioning or request controls for improved client-side handling| US20110096828A1|2009-09-22|2011-04-28|Qualcomm Incorporated|Enhanced block-request streaming using scalable encoding| US8635359B2|2009-11-06|2014-01-21|Telefonaktiebolaget L M Ericsson |File format for synchronized media| US9185439B2|2010-07-15|2015-11-10|Qualcomm Incorporated|Signaling data for multiplexing video components| TW201210325A|2010-07-21|2012-03-01|Nokia Corp|Method and apparatus for indicating switching points in a streaming session|US8635359B2|2009-11-06|2014-01-21|Telefonaktiebolaget L M Ericsson |File format for synchronized media| KR101777348B1|2010-02-23|2017-09-11|삼성전자주식회사|Method and apparatus for transmitting and receiving of data| EP3518497A1|2010-04-20|2019-07-31|Samsung Electronics Co., Ltd.|Method for transmitting multimedia content| 
BE1019349A5|2010-05-25|2012-06-05|H O P|METHOD AND DEVICE FOR STORING AND / OR REPRODUCING SOUND AND IMAGES.| KR101702562B1|2010-06-18|2017-02-03|삼성전자 주식회사|Storage file format for multimedia streaming file, storage method and client apparatus using the same| EP2649798A4|2010-12-06|2017-06-07|Oracle International Corporation|Media platform integration system| WO2011100901A2|2011-04-07|2011-08-25|华为技术有限公司|Method, device and system for transmitting and processing media content| US9578354B2|2011-04-18|2017-02-21|Verizon Patent And Licensing Inc.|Decoupled slicing and encoding of media content| US9420259B2|2011-05-24|2016-08-16|Comcast Cable Communications, Llc|Dynamic distribution of three-dimensional content| US8849819B2|2011-08-05|2014-09-30|Deacon Johnson|System and method for controlling and organizing metadata associated with on-line content| GB2499643A|2012-02-24|2013-08-28|Canon Kk|Extracting a meta data fragment from a metadata component associated with multimedia data| US9219929B2|2012-02-27|2015-12-22|Fritz Barnes|Enhanced startup and channel change for fragmented media stream delivery| US9769368B1|2013-09-25|2017-09-19|Looksytv, Inc.|Remote video system| CN106105235B|2014-01-09|2020-04-10|三星电子株式会社|Method and apparatus for transmitting media data related information in multimedia transmission system| US9449640B2|2014-06-03|2016-09-20|Glenn Kreisel|Media device turntable| US20150358507A1|2014-06-04|2015-12-10|Sony Corporation|Timing recovery for embedded metadata| US9954570B2|2015-03-30|2018-04-24|Glenn Kreisel|Rotatable device| CN105611401B|2015-12-18|2018-08-24|无锡天脉聚源传媒科技有限公司|A kind of method and apparatus of video clipping| US10116719B1|2016-06-03|2018-10-30|Amazon Technologies, Inc.|Customized dash manifest| US10104143B1|2016-06-03|2018-10-16|Amazon Technologies, Inc.|Manifest segmentation| US10432690B1|2016-06-03|2019-10-01|Amazon Technologies, Inc.|Manifest partitioning| CN108124169A|2016-11-29|2018-06-05|中国科学院声学研究所|A kind of P2P Video service 
accelerated methods of household radio router| US10939086B2|2018-01-17|2021-03-02|Mediatek Singapore Pte. Ltd.|Methods and apparatus for encoding and decoding virtual reality content| US10944977B2|2018-04-03|2021-03-09|Mediatek Singapore Pte. Ltd.|Methods and apparatus for encoding and decoding overlay compositions| US10869016B2|2018-04-12|2020-12-15|Mediatek Singapore Pte. Ltd.|Methods and apparatus for encoding and decoding virtual reality content| CN110881018B|2018-09-05|2020-11-03|北京开广信息技术有限公司|Real-time receiving method and client of media stream|
Legal status:
2020-10-06 | B06F | Objections, documents and/or translations needed after an examination request [chapter 6.6 patent gazette]
2020-11-17 | B08F | Application dismissed because of non-payment of annual fees [chapter 8.6 patent gazette] | Free-format text: "Relating to the 10th annuity."
2021-03-09 | B08K | Patent lapsed as no evidence of payment of the annual fee has been furnished to INPI [chapter 8.11 patent gazette] | Free-format text: "In view of the shelving published in RPI 2602 of 2020-11-17, and given the absence of any response within the legal deadlines, the shelving of the patent application is to be maintained, pursuant to Article 12 of Resolution 113/2013."
2021-11-23 | B350 | Update of information on the portal [chapter 15.35 patent gazette]
Priority:
Application number | Filing date | Patent title
US25865409P | 2009-11-06 |
US61/258,654 | 2009-11-06 |
PCT/SE2010/051209 (WO2011056139A1) | 2010-11-05 | File format for synchronized media